Search Result

Journals

Publication Years

Keywords

Please wait a minute...

For Selected:

Download Citations
EndNote Ris BibTeX

Toggle Thumbnails

Select

Text-to-image synthesis method based on multi-level progressive resolution generative adversarial networks

XU Yining, HE Xiaohai, ZHANG Jin, QING Linbo

Journal of Computer Applications 2020, 40 (12): 3612-3617. DOI: 10.11772/j.issn.1001-9081.2020040575

Abstract （347）

PDF （1238KB）（352）

Save

To address the problem that the results of text-to-image synthesis tasks have wrong target structures and unclear image textures, a Multi-level Progressive Resolution Generative Adversarial Network (MPRGAN) model was proposed based on Attentional Generative Adversarial Network (AttnGAN). Firstly, a semantic separation-fusion generation module was used in low-resolution layer, and the text feature was separated into three feature vectors by the guidance of self-attention mechanism and the feature vectors were used to generate feature maps respectively. Then, the feature maps were fused into low-resolution map, and the mask images were used as semantic constraints to improve the stability of the low-resolution generator. Finally, the progressive resolution residual structure was adopted in high-resolution layers. At the same time, the word attention mechanism and pixel shuffle were combined to further improve the quality of the generated images. Experimental results showed that, the Inception Score (IS) of the proposed model reaches 4.70 and 3.53 respectively on datasets of Caltech-UCSD Birds-200-2011 (CUB-200-2011) and 102 category flower dataset (Oxford-102), which are 7.80% and 3.82% higher than those of AttnGAN, respectively. The MPRGAN model can solve the instability problem of structure generation to a certain extent, and the images generated by the proposed model is closer to the real images.

Reference | Related Articles | Metrics